Single Character Chinese Named Entity Recognition
نویسندگان
چکیده
Single character named entity (SCNE) is a name entity (NE) composed of one Chinese character, such as “ ” (zhong1, China) and “ ” e2,Russia . SCNE is very common in written Chinese text. However, due to the lack of in-depth research, SCNE is a major source of errors in named entity recognition (NER). This paper formulates the SCNE recognition within the sourcechannel model framework. Our experiments show very encouraging results: an Fscore of 81.01% for single character location name recognition, and an F-score of 68.02% for single character person name recognition. An alternative view of the SCNE recognition problem is to formulate it as a classification task. We construct two classifiers based on maximum entropy model (ME) and vector space model (VSM), respectively. We compare all proposed approaches, showing that the sourcechannel model performs the best in most cases.
منابع مشابه
Character Language Models for Chinese Word Segmentation and Named Entity Recognition
We describe the application of the LingPipe toolkit (Alias-i 2006) to Chinese word segmentation and named entity recognition. We provide results for the third SIGHAN Chinese language processing bakeoff (Levow 2006). F1 measures on the best performing corpora were .972 for word segmentation and .855 for person/location/organization named-
متن کاملChinese Word Segmentation and Named Entity Recognition by Character Tagging
This paper describes our word segmentation system and named entity recognition (NER) system for participating in the third SIGHAN Bakeoff. Both of them are based on character tagging, but use different tag sets and different features. Evaluation results show that our word segmentation system achieved 93.3% and 94.7% F-score in UPUC and MSRA open tests, and our NER system got 70.84% and 81.32% F...
متن کاملChinese Named Entity Recognition with a Multi-Phase Model
Chinese named entity recognition is one of the difficult and challenging tasks of NLP. In this paper, we present a Chinese named entity recognition system using a multi-phase model. First, we segment the text with a character-level CRF model. Then we apply three word-level CRF models to the labeling person names, location names and organization names in the segmentation results, respectively. O...
متن کاملImprovement of Chemical Named Entity Recognition through Sentence-based Random Under-sampling and Classifier Combination
Chemical Named Entity Recognition (NER) is the basic step for consequent information extraction tasks such as named entity resolution, drug-drug interaction discovery, extraction of the names of the molecules and their properties. Improvement in the performance of such systems may affects the quality of the subsequent tasks. Chemical text from which data for named entity recognition is extracte...
متن کاملRecognizing named entities in spoken Chinese dialogues with a character-level maximum entropy tagger
Named Entity Recognition (NER) is an important task in information extraction, where major attention has been paid to written texts of a news or academic paper (esp. biomedical) style. In this paper we report the first piece of work on NER in spoken Chinese dialogues, as a preliminary step for spoken language understanding. The NER task is taken as a sequential classification problem and solved...
متن کامل